Effect of Confidence and Explanation on Accuracy and Trust Calibration in AI-Assisted Decision Making
Today, AI is being increasingly used to help human experts make decisions in
high-stakes scenarios. In these scenarios, full automation is often
undesirable, not only due to the significance of the outcome, but also because
human experts can draw on their domain knowledge complementary to the model's
to ensure task success. We refer to these scenarios as AI-assisted decision
making, where the individual strengths of the human and the AI come together to
optimize the joint decision outcome. A key to their success is to appropriately
\textit{calibrate} human trust in the AI on a case-by-case basis; knowing when
to trust or distrust the AI allows the human expert to appropriately apply
their knowledge, improving decision outcomes in cases where the model is likely
to perform poorly. This research conducts a case study of AI-assisted decision
making in which humans and AI have comparable performance alone, and explores
whether features that reveal case-specific model information can calibrate
trust and improve the joint performance of the human and AI. Specifically, we
study the effect of showing confidence score and local explanation for a
particular prediction. Through two human experiments, we show that confidence
score can help calibrate people's trust in an AI model, but trust calibration
alone is not sufficient to improve AI-assisted decision making, which may also
depend on whether the human can bring in enough unique knowledge to complement
the AI's errors. We also highlight the problems of using local explanations in
AI-assisted decision making scenarios and invite the research community to
explore new approaches to explainability for calibrating human trust in AI.
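The case-by-case trust calibration described above can be illustrated with a simple decision policy: defer to the AI only when its reported confidence for the current case clears a threshold, and otherwise rely on the human's judgment. This is a hypothetical sketch, not the study's protocol; the function names and the threshold value are assumptions for illustration.

```python
def assisted_decision(ai_prediction, ai_confidence, human_prediction,
                      trust_threshold=0.8):
    """Joint decision under a confidence-gated trust policy (illustrative)."""
    if ai_confidence >= trust_threshold:
        return ai_prediction   # trust the model on high-confidence cases
    return human_prediction    # fall back on human expertise otherwise

# High confidence -> follow the AI; low confidence -> follow the human.
print(assisted_decision("approve", 0.92, "reject"))  # approve
print(assisted_decision("approve", 0.55, "reject"))  # reject
```

A policy like this only improves joint performance when the human's knowledge actually covers the cases where the model is unconfident, which is the complementarity condition the abstract highlights.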
Visualizations for an Explainable Planning Agent
In this paper, we report on the visualization capabilities of an Explainable
AI Planning (XAIP) agent that can support human-in-the-loop decision making.
Imposing transparency and explainability requirements on such agents is
especially important in order to establish trust and common ground with the
end-to-end automated planning system. Visualizing the agent's internal
decision-making processes is a crucial step towards achieving this. This may
include externalizing the "brain" of the agent -- starting from its sensory
inputs, to progressively higher order decisions made by it in order to drive
its planning components. We also show how the planner can bootstrap on the
latest techniques in explainable planning to cast plan visualization as a plan
explanation problem, and thus provide concise model-based visualization of its
plans. We demonstrate these functionalities in the context of the automated
planning components of a smart assistant in an instrumented meeting space.
Comment: Previously "Mr. Jones -- Towards a Proactive Smart Room Orchestrator"
(appeared in AAAI 2017 Fall Symposium on Human-Agent Groups).
Bootstrapping Conversational Agents With Weak Supervision
Many conversational agents in the market today follow a standard bot
development framework which requires training intent classifiers to recognize
user input. The need to create a proper set of training examples is often the
bottleneck in the development process. Agent developers often have
access to historical chat logs that can provide a good quantity as well as
coverage of training examples. However, the cost of labeling them with tens to
hundreds of intents often prohibits taking full advantage of these chat logs.
In this paper, we present a framework called \textit{search, label, and
propagate} (SLP) for bootstrapping intents from existing chat logs using weak
supervision. The framework reduces hours to days of labeling effort down to
minutes of work by using a search engine to find examples, then relies on a
data programming approach to automatically expand the labels. We report on a
user study that shows positive user feedback for this new approach to build
conversational agents, and demonstrates the effectiveness of using data
programming for auto-labeling. While the system is developed for training
conversational agents, the framework has broader application in significantly
reducing labeling effort for training text classifiers.
Comment: 6 pages, 3 figures, 1 table. Accepted for publication in IAAI 201
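The "propagate" step above can be sketched with a data-programming-style approach: a few keyword rules (labeling functions), written after searching and labeling seed examples, vote on intent labels for unlabeled chat-log utterances. All names and rules below are illustrative assumptions, not the SLP system's actual implementation.

```python
# Toy labeling functions: each returns an intent label or None.
def lf_billing(utterance):
    return "billing" if "invoice" in utterance.lower() else None

def lf_reset(utterance):
    return "password_reset" if "password" in utterance.lower() else None

LABELING_FUNCTIONS = [lf_billing, lf_reset]

def propagate(chat_logs):
    """Auto-label utterances by majority vote over labeling functions."""
    labeled = []
    for utterance in chat_logs:
        votes = [v for lf in LABELING_FUNCTIONS
                 if (v := lf(utterance)) is not None]
        if votes:  # keep only utterances at least one rule fires on
            label = max(set(votes), key=votes.count)
            labeled.append((utterance, label))
    return labeled

logs = ["Where is my invoice?", "I forgot my password", "hello there"]
print(propagate(logs))
# [('Where is my invoice?', 'billing'), ('I forgot my password', 'password_reset')]
```

In practice, frameworks for data programming learn to weight and denoise many such noisy rules rather than taking a raw majority vote, but the propagation idea is the same: a handful of labeled seeds plus cheap rules can label logs that would take hours or days to annotate by hand.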
Adult Social Work and High Risk Domestic Violence Cases
Summary
This article focuses on adult social work’s response in England to high-risk domestic violence cases and on the role of adult social workers in Multi-Agency Risk and Assessment Conferences (MARACs). The research was undertaken between 2013 and 2014 and
focused on one city in England. It involved the research team attending MARACs, interviews with 20 adult social workers, 24 MARAC attendees, and 14 adult service users at time T1 (including follow-up interviews after six months, T2), focus groups with IDVAs and Women’s Aid, and an interview with a Women’s Aid service user.
Findings
The findings suggest that although adult social workers accept the need to be involved in domestic violence cases, they are uncertain of what their role is and are confused by the need to operate parallel domestic violence and adult safeguarding approaches, which is further complicated by issues of mental capacity. MARACs are identified as overburdened, under-represented meetings staffed by committed managers. However, they are in danger of becoming managerial processes that neglect the service users they are meant to protect.
Applications
The article argues for a re-engagement of adult social workers with domestic violence, an area that has increasingly become over-identified with child protection. It also raises the question of whether MARACs remain fit for purpose and whether they still represent the best possible response to multi-agency coordination and practice in domestic violence.
The Knee Clinical Assessment Study – CAS(K). A prospective study of knee pain and knee osteoarthritis in the general population: baseline recruitment and retention at 18 months
BACKGROUND: Selective non-participation at baseline (due to non-response and non-consent) and loss to follow-up are important concerns for longitudinal observational research. We investigated these matters in the context of baseline recruitment and retention at 18 months of participants for a prospective observational cohort study of knee pain and knee osteoarthritis in the general population.
METHODS: Participants were recruited to the Knee Clinical Assessment Study – CAS(K) – by a multi-stage process involving response to two postal questionnaires, consent to further contact and medical record review (optional), and attendance at a research clinic. Follow-up at 18 months was by postal questionnaire. The characteristics of responders/consenters were described for each stage in the recruitment process to identify patterns of selective non-participation and loss to follow-up. The external validity of findings from the clinic attenders was tested by comparing the distribution of WOMAC scores and the association between physical function and obesity with the same parameters measured directly in the target population as a whole.
RESULTS: 3106 adults aged 50 years and over reporting knee pain in the previous 12 months were identified from the first baseline questionnaire. Of these, 819 consented to further contact, responded to the second questionnaire, and attended the research clinics. 776 were successfully followed up at 18 months. There was evidence of selective non-participation during recruitment (aged 80 years and over, lower socioeconomic group, currently in employment, experiencing anxiety or depression, brief episode of knee pain within the previous year). This did not cause significant bias in either the distribution of WOMAC scores or the association between physical function and obesity.
CONCLUSION: Despite recruiting a minority of the target population to the research clinics and some evidence of selective non-participation, this appears not to have resulted in significant bias of cross-sectional estimates. The main effect of non-participation in the current cohort is likely to be a loss of precision in stratum-specific estimates, e.g. in those aged 80 years and over. The subgroup of individuals who attended the research clinics and who make up the CAS(K) cohort can be used to accurately estimate parameters in the reference population as a whole. The potential for selection bias, however, remains an important consideration in each subsequent analysis.